MatLM: a Matrix Formulation for Probabilistic Language Models
Authors
Abstract
Probabilistic language models are widely used in Information Retrieval (IR) to rank documents by the probability that they generate the query. However, implementing these probabilistic representations in programming languages that favor matrix calculations is challenging. In this paper, we use matrix representations to reformulate probabilistic language models. The matrix representation serves as a superstructure that organizes the calculated probabilities, and as a potential formalism for standardizing language models and for further mathematical analysis. It also facilitates implementation in matrix-friendly programming languages. We consider the matrix formulation of the conventional language model with Dirichlet smoothing and of two language models based on Latent Dirichlet Allocation (LDA), namely LBDM and LDI. We release a Java software package, MatLM, implementing the proposed models. Code is available at: https://github.com/yanshanwang/JGibbLDA-v.1.0-MatLM.
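As a rough illustration of what a matrix view of the Dirichlet-smoothed language model can look like, the sketch below builds a document-by-term probability matrix from a term-count matrix and scores a query vector against every document at once. It uses the standard Dirichlet smoothing formula, P(w|d) = (c(w,d) + mu * P(w|C)) / (|d| + mu). The function names, the toy counts, and the value of `mu` are assumptions made for this example; they are not taken from the MatLM package, and the abstract does not state the paper's exact notation.

```python
import math


def dirichlet_lm_matrix(counts, mu):
    """Return a D x V matrix whose entry [d][w] is the smoothed P(w | d).

    counts: D x V term-count matrix as a list of rows (hypothetical layout).
    mu: Dirichlet smoothing parameter, assumed to be positive.
    """
    n_terms = len(counts[0])
    total = sum(sum(row) for row in counts)
    # Collection model P(w | C): column sums divided by the total token count.
    p_coll = [sum(row[w] for row in counts) / total for w in range(n_terms)]
    probs = []
    for row in counts:
        doc_len = sum(row)
        probs.append(
            [(row[w] + mu * p_coll[w]) / (doc_len + mu) for w in range(n_terms)]
        )
    return probs


def score_query(probs, query_vec):
    """Query log-likelihood per document: the product log(P) . q, row by row."""
    return [
        sum(q * math.log(p) for q, p in zip(query_vec, row) if q)
        for row in probs
    ]


# Toy data (illustrative only): 2 documents, 3 vocabulary terms.
counts = [[2, 1, 0], [0, 1, 3]]
probs = dirichlet_lm_matrix(counts, mu=2.0)
scores = score_query(probs, [1, 0, 0])  # query made of the first term only
ranking = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
```

Each row of `probs` is a probability distribution over the vocabulary, so ranking reduces to one matrix-vector product in log space. In a matrix-oriented language, the two loops would collapse into element-wise operations on whole matrices.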
Journal: CoRR
Volume: abs/1610.00735
Publication date: 2016